1 IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

2 

3 Date: October 3, 2007 

4 

5 In re application of: 

6 Kelkar et al 

7 Serial No.: 10/629,448 

8 Filed: July 29, 2003 

9 Group Art Unit: 1631 

10 Examiner: Loria Clow

11 FOR: Method and Program 

12 Product for Discovering 

13 Similar Gene Expression Profiles 
14 

15 

16 APPEAL BRIEF AND FEE IN SUPPORT OF APPEAL 

17 FROM THE PRIMARY EXAMINER TO THE BOARD OF APPEALS 

18 

19 Assistant Commissioner for Patents 

20 Washington DC 20231 
21 

22 Sir: 
23 

24 Appellants herewith submit an Appeal Brief in support of the 

25 appeal to the Board of Patent Appeals and Interferences from the 

26 decision dated May 16, 2007 of the Primary Examiner finally 

27 rejecting claims 1-6, 10-16 and 20. 
FEE

Please charge the fee of $510.00 set by 37 CFR § 41.20(b)(2)
for filing a brief in support of an appeal to Deposit Account No.
09-0469. Charge any excess fee or deposit any overpayment to
Deposit Account No. 09-0469.

ORAL HEARING 

Appellants do not request an Oral Hearing. 

(1) Real Party in Interest 

The real party in interest in this appeal is International 
Business Machines Corporation, a New York corporation, assignee 
of the entire right, title and interest in the claimed invention. 

(2) Related Appeals and Interferences 

No other appeals or interferences are known to the 
Appellants, the Appellants' legal representative, or assignee 
that will directly affect or be directly affected by or have a 
bearing on the Board's decision in this appeal. 

(3) Status of Claims 

Claims 1-6, 10-16 and 20 are pending in this application. 

Claims 7-9 and 17-19 were canceled after restriction. 

The rejection of claims 10-16 and 20 under 35 U.S.C. 101 and

the rejection of claims 1-6 under 35 U.S.C. 101 and

for new matter are appealed.

(4) Status of Amendments 

The amendment filed before final has been entered. 
The amendment filed after final has not been entered. 
1 (5) Summary of Invention

2 The present invention relates to a method and program 

3 product operating in a personal computer for clustering genes 

4 having potential functional similarity by a comparison of their 

5 time varying gene expression profiles. 

6 The method of the invention uses the time and intensity 

7 invariant correlation function of the IBM Intelligent Miner tool to find matches of

8 gene expression profiles using both time and intensity 

9 information, which is better at detecting functional similarity 

10 than using intensity information alone. The output of 

11 Intelligent Miner is a data set of gene expression pairs with the 

12 match fraction and number of subsets used to compare each pair. A

13 threshold match fraction is chosen and genes are listed in clusters

14 by their match fractions. Genes are then removed from all except 

15 the cluster with the highest match fraction. Any genes not 

16 already in a cluster are added to a cluster which includes a gene 

17 that has a highest match fraction with the added gene. 
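
By way of illustration only, the clustering steps summarized above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the Intelligent Miner interface or the claimed implementation: the function name cluster_genes, the (gene_a, gene_b, match_fraction) tuple format, and the rule used to seed the initial overlapping clusters are all hypothetical choices made for the example.

```python
def cluster_genes(pairs, threshold):
    """Group genes so that each gene ends up in exactly one cluster.

    pairs: iterable of (gene_a, gene_b, match_fraction) tuples (assumed format).
    threshold: the chosen threshold match fraction.
    """
    pairs = list(pairs)
    score = {frozenset((a, b)): f for a, b, f in pairs}
    genes = sorted({g for a, b, _ in pairs for g in (a, b)})

    def best_in(gene, cluster):
        # Highest match fraction between `gene` and any other cluster member.
        return max(
            (score.get(frozenset((gene, other)), 0.0)
             for other in cluster if other != gene),
            default=0.0,
        )

    # List pairs above the threshold in clusters (assumed seeding rule).
    # A gene may be listed in more than one cluster at this stage.
    clusters = []
    for a, b, f in sorted(pairs, key=lambda p: p[2], reverse=True):
        if f <= threshold:
            continue
        homes = [c for c in clusters if a in c or b in c]
        if homes:
            for c in homes:
                c.update((a, b))
        else:
            clusters.append({a, b})

    # Remove each gene from all clusters except the one in which it has
    # its highest match fraction with another member.
    for g in genes:
        homes = [c for c in clusters if g in c]
        if len(homes) > 1:
            keep = max(homes, key=lambda c: best_in(g, c))
            for c in homes:
                if c is not keep:
                    c.discard(g)

    # Add any gene not yet clustered to the cluster holding the gene with
    # which it has its highest match fraction, disregarding the threshold.
    clustered = set().union(*clusters)
    for g in genes:
        if g in clustered:
            continue
        if clusters:
            max(clusters, key=lambda c: best_in(g, c)).add(g)
        else:
            clusters.append({g})

    return [sorted(c) for c in clusters if c]
```

The function could be called, for example, as cluster_genes([("g1", "g2", 0.9), ("g2", "g3", 0.8), ("g3", "g4", 0.95), ("g5", "g1", 0.3)], 0.5). A single pass over the genes suffices for the removal step in this sketch because removing a gene changes only that gene's own cluster membership.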
18 

19 (6) Issues 

20 

21 I. Whether output to a user is a required claim step in order 

22 to define an invention, that is a practical application which is 

23 useful, concrete and tangible. 
24 

25 II. Whether applicants' teaching of a personal computer with 

26 implicit, intrinsic and inherent output means in the 

27 specification supports claims 1-6 without adding new matter.
28 
With respect to the final rejection of claims 1-6, 10-16
and 20 under 35 U.S.C. 101, the rejected claims are grouped into
two groups.

Claim 10 is representative of group I and is related to
Issue I.

Claim 1 is representative of group II and is related to
Issue II.
1 (7) Argument

2 

3 Issue I: Whether output to a user is a required claim step in 

4 order to define an invention, that is a practical application 

5 which is useful, concrete and tangible. 
6 

7 The Group I Claims 

8 

9 Appellants claim in exemplary claim 10: 

10 10. A program product having computer readable code stored on a 

11 recordable media for determining similarity between portions of 

12 gene expression profiles comprising: 

13 programmed means for processing a number of gene expression 

14 profiles with a similar sequences algorithm that is a time and 

15 intensity invariant correlation function to obtain a data set of 

16 gene expression pairs and a match fraction for each pair; 

17 programmed means for listing gene expression pairs in 

18 clusters by their match fractions; 

19 programmed means for removing a first gene from a cluster 

20 when the first gene is also in another cluster which has another 

21 gene with a higher match fraction with the first gene than any of 

22 the genes in the cluster have with the first gene; 

23 programmed means for repeating the removing step until all 

24 genes are listed in only one cluster. 
25 
1 Applicants believe that the specification and claims indeed

2 do describe a method and a program product that produce a result 

3 that has substantial and credible utility as required by MPEP 

4 2107 II and that the claims are limited to a narrow practical 

5 application in a computer related art. 
6 

7 The Examiner relies on the "New Interim Guidelines" to 

8 interpret the requirements of the Federal Courts under the 

9 current law to require claiming "output to a user". Applicants 

10 believe that the Examiner is mistaken and is applying an 

11 interpretation of the definition of the word tangible that is: 

12 1) narrower than appropriate under the current law and is 

13 2) narrower than required under the "New Guidelines". 
14 

15 1) The introduction to the "New Guidelines" states: 

16 "These Examination Guidelines ("Guidelines") are based on the USPTO's current understanding 

17 of the law and are believed to be fully consistent with binding precedent of the Supreme Court, 

18 the Federal Circuit and the Federal Circuit's predecessor courts. These Guidelines do not 

19 constitute substantive rulemaking and hence do not have the force and effect of law." 
20 

21 In following the "Guidelines", the Examiner appears to 

22 require separate interpretations of the words useful, concrete 

23 and tangible. 
24 

25 Applicants' attorney has found no basis in any of the

26 Federal Circuit opinions using these words for concluding that these

27 terms are to have separate meanings. They appear to always be

28 used together as synonyms for the concept of being useful and

29 non-abstract. Applicants' attorney has requested that the

30 Examiner provide a citation to a court's requirement that these

31 terms are part of a three pronged test if such is the case in
1 order to help applicants decide whether to appeal or request

2 continued examination. No citation was provided.

3 2) Even under the "Guidelines", the Examiner's interpretation

4 of the word tangible is unnecessarily narrow.

5 The "Guidelines" at page 13 recite:

6 "Accordingly, a complete definition of the scope of 35 U.S.C. § 101, reflecting Congressional 

7 intent, is that any new and useful process, machine, manufacture or composition of matter under 

8 the sun that is made by man is the proper subject matter of a patent. The subject matter courts 

9 have found to be outside of, or exceptions to, the four statutory categories of invention is limited 

10 to abstract ideas, laws of nature and natural phenomena. While this is easily stated, determining 

11 whether an applicant is seeking to patent an abstract idea, a law of nature or a natural

12 phenomenon has proven to be challenging." 
13 

14 Beginning at page 21 the "Guidelines" recite:

15 "TANGIBLE RESULT" 

16 "The tangible requirement does not necessarily mean that a claim must either be tied to a particular 

17 machine or apparatus or must operate to change articles or materials to a different state or thing. 

18 However, the tangible requirement does require that the claim must recite more than a § 101 judicial 

19 exception, in that the process claim must set forth a practical application of that §101 judicial 

20 exception to produce a real-world result. Benson, 409 U.S. at 71-72, 175 USPQ at 676-77 (invention 

21 ineligible because had "no substantial practical application."). "[A]n application of a law of nature or 

22 mathematical formula to a ... process may well be deserving of patent protection." Diehr, 450 U.S. 

23 at 187, 209 USPQ at 8 (emphasis added); see also Corning, 56 U.S. (15 How.) at 268, 14 L.Ed. 683

24 ("It is for the discovery or invention of some practical method or means of producing a beneficial

25 result or effect, that a patent is granted . . . .")."
26 

27 In other words, the opposite meaning of "tangible" is "abstract."

28 The bare conversion of any binary data as in Gottschalk v. Benson

29 or the bubble sort of any data as in "Warmerdam, 33 F.3d at 1360, 31

30 USPQ2d at 1759 ("steps of 'locating' a medial axis, and 'creating' a bubble hierarchy . . . describe
1 nothing more than the manipulation of basic mathematical constructs, the paradigmatic 'abstract

2 idea'")" recited at page 14 of the "Guidelines" are examples of the

3 abstract.
4 

5 Applicants' process does not convert or process just any data but 

6 is limited to useful concrete and non-abstract gene expression 

7 profiles in a data base of such profiles. Applicants' process is 

8 but one application of many possible applications of the 

9 mathematical steps involved in obtaining the useful result. 
10 

11 At page 17 of the "Guidelines" we see:

12 While abstract ideas, natural phenomena, and laws of nature are not eligible for patenting, methods 

13 and products employing abstract ideas, natural phenomena, and laws of nature to perform a real- 

14 world function may well be. In evaluating whether a claim meets the requirements of section 101, 

15 the claim must be considered as a whole to determine whether it is for a particular application of an 

16 abstract idea, natural phenomenon, or law of nature, rather than for the abstract idea, natural

17 phenomenon, or law of nature itself. 
18 

19 As is clear from the specification and the claim limitations, 

20 applicants' process is limited to a particular practical 

21 application and is not an abstract idea, natural phenomenon or a 

22 law of nature.
23 

24 The result is that all of the processed gene expression profiles

25 are each listed in only one cluster. This result of applicants'

26 claims is a very useful, repeatable and non-abstract result which

27 is recognized by those skilled in the medical and computer arts

28 to be a valuable, useful, non-abstract and concrete

29 finding of similar gene expression profiles.
30 
1 PRIOR ART

2 Applicants note that their claims have not been rejected on prior 

3 art yet have been restricted on the ground that there were two 

4 groups of claims that required two fields of search. It is not

5 apparent whether relevant prior art patents were considered by 

6 the Examiner while examining this application. It is believed 

7 that the "Guidelines" on page 10 are helpful in determining both 

8 the novelty of applicants' invention and the usefulness and

9 non-abstract nature of applicants' invention.
10 

11 As evidenced by the references which applicants have attempted to 

12 incorporate by reference, but have acquiesced to the Examiner's

13 correct requirement to cancel, in addition to applicants'

14 teachings in the background art section of their specification, 

15 users in the medical profession find great value and usefulness 

16 in methods for finding similar gene expression profiles that are 

17 tangible and concrete. See for example US Patent 6,406,853 

18 abstract and claims 25, 26 and US Patent 6,436,642 column 26 

19 beginning at line 15. 
20 

21 It is believed that if the rejections under 35 U.S.C. 101 put 

22 forth in this application were appropriate, many of the relevant 

23 prior art patents in the appropriate fields of search would be 

24 found to be invalid. Since they were issued under the guidance 

25 of current statutory law and court cases, it must be that the 

26 rejections in this application are based upon an excessively narrow

27 and untenable interpretation of the current law. 
28 
1 Issue II: Whether applicants' teaching of a personal

2 computer with implicit, intrinsic and inherent output means in

3 the specification supports claims 1-6 without adding new matter.
4 

5 The Group II Claims 

6 Appellants claim in exemplary claim 1: 

7 1. A method for determining similarity between portions of gene

8 expression profiles in a computer comprising the steps of: 

9 processing a number of gene expression profiles with a 

10 similar sequences algorithm that is a time and intensity 

11 invariant correlation function to obtain a data set of gene 

12 expression profile pairs and a match fraction for each gene 

13 expression profile pair; 

14 listing gene expression profile pairs in clusters by their 

15 match fractions; 

16 removing a first gene expression profile from a cluster when 

17 another cluster has another gene expression profile with a higher 

18 match fraction with the first gene expression profile, unless the 

19 another gene expression profile requires a larger number of 

20 subsequences to achieve similarity with the first gene expression 

21 profile; 

22 repeating the removing step until all gene expression 

23 profiles are listed in only one cluster; 

24 providing output of the listing of clusters of gene 

25 expression profiles. 
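
For illustration only, the condition recited in the removing step of claim 1 may be sketched as a simple test. This is a sketch under stated assumptions and not the claimed implementation: the names Match and should_remove are hypothetical, and the sketch assumes that "a larger number of subsequences" is measured against the best match the profile has in its current cluster.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Best match a profile has within one cluster (hypothetical record)."""
    match_fraction: float
    subsequences: int


def should_remove(current: Match, other: Match) -> bool:
    """Return True when the profile should leave its current cluster.

    The profile is removed when another cluster offers a higher match
    fraction, unless that other match needs a larger number of
    subsequences to achieve similarity (assumed comparison baseline:
    the best match in the current cluster).
    """
    if other.match_fraction <= current.match_fraction:
        return False
    return other.subsequences <= current.subsequences
```

Under these assumptions, a profile whose best current match is Match(0.8, 2) would move for a competing Match(0.9, 2) but would stay for a competing Match(0.9, 5).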
26 
1 Applicants' specification recites: The focal point of the

2 preferred personal computer architecture comprises a processor 

3 51. The processor 51 is connected to a bus 52 which comprises a 

4 set of data lines, a set of address lines and a set of control 

5 lines. A plurality of I/O devices, memory and storage devices 

6 53-58 and 66 are connected to the bus 52 through separate 

7 adapters 59-64 and 67, respectively. For example, the display 54 

8 may be either a CRT or a flat panel display. 
9 

10 It is believed to be well known in the art as exemplified by 

11 prior art patents that users in the medical profession receive 

12 output from personal computer input/output devices such as 

13 applicants teach in their preferred embodiment. Again, 

14 applicants refer to US Patent 6,406,853 abstract and claims 25, 

15 26 and US Patent 6,436,642 column 26 beginning at line 15. 
16 

17 It is believed that material that is implicit, intrinsic, or 

18 inherent in the application as filed is not new matter. 
19 

20 In order to be usable by a user, a personal computer 

21 necessarily and constantly exhibits the function of input and 

22 output, and such function was recognized as such by those skilled 

23 in the art of using personal computers. Therefore applicants' 

24 addition of the step of providing such output to satisfy the 

25 Examiner's reading of the guidelines was not new matter but is 

26 supported in their specification by teachings that are implicit, 

27 intrinsic and inherent. 
28 
1 Accordingly it is believed that the claims are clear,

2 statutory and definite and are drawn to a novel and unobvious 

3 method and program product for clustering gene expression 

4 profiles which result is concrete, tangible and directly useful 

5 in drug selection and disease diagnosis. 
6 

7 Request for Relief 

8 

9 Wherefore, Appellants respectfully request that the 

10 rejection of pending claims 1-6, 10-16 and 20 be reversed.

11 

12 
13 
14 

15 Date: October 3, 2007 
16 

17 IBM Corporation 

18 Intellectual Property Law 

19 MG90-201/1 

20 8501 IBM Drive 

21 Charlotte, NC 28262-8563 
22 



Respectfully submitted,

Karl O. Hesse, Reg.
Attorney for Appellants

Land line: (704) 895-8241
Cell phone: (704) 724-1413
Fax: (704) 594-8307
1 (8) Appendix

2 Claims Involved in this Appeal 

3 

4 1. A method for determining similarity between portions of 

5 gene expression profiles in a computer comprising the steps of: 

6 processing a number of gene expression profiles with a 

7 similar sequences algorithm that is a time and intensity 

8 invariant correlation function to obtain a data set of gene 

9 expression profile pairs and a match fraction for each gene 

10 expression profile pair; 

11 listing gene expression profile pairs in clusters by their 

12 match fractions; 

13 removing a first gene expression profile from a cluster when 

14 another cluster has another gene expression profile with a higher 

15 match fraction with the first gene expression profile, unless the 

16 another gene expression profile requires a larger number of 

17 subsequences to achieve similarity with the first gene expression 

18 profile; 

19 repeating the removing step until all gene expression 

20 profiles are listed in only one cluster; 

21 providing output of the listing of clusters of gene 

22 expression profiles. 
23 
1 2. A method for determining similarity between portions of

2 gene expression profiles comprising the steps of: 

3 processing a number of gene expression profiles with a 

4 similar sequences algorithm that is a time and intensity 

5 invariant correlation function to obtain a data set of gene 

6 expression pairs and a match fraction for each pair; 

7 listing gene expression pairs in clusters by their match 

8 fractions; 

9 removing a first gene from a first cluster when the first 

10 gene is also in a second cluster which has another gene with a 

11 higher match fraction with the first gene than any of the genes 

12 in the first cluster have with the first gene, but;

13 retaining the first gene in the first cluster and removing 

14 the first gene from the second cluster when the difference 

15 between the highest match fraction of the first gene with a gene 

16 in the first cluster and the highest match fraction of the first

17 gene with a gene in the second cluster is less than a minimum 

18 difference threshold and the number of subsequences represented 

19 in the similar gene pair having the highest match fraction in the 

20 first cluster is higher than the number of subsequences 

21 represented in the similar gene pair having the highest match 

22 fraction in the second cluster; 

23 repeating the removing step until all genes are listed in 

24 only one cluster; 

25 providing output of the listing of clusters of gene 

26 expression profiles. 
1 3. A method of determining similarity between portions of

2 gene expression profiles comprising the steps of: 

3 processing data embodying a number of gene expression 

4 profiles with a similar sequences algorithm in a computer that is 

5 a time and intensity invariant correlation function to obtain a 

6 data set of gene expression pairs and a match fraction for each 

7 pair; 

8 choosing a threshold match fraction; 

9 listing gene expression pairs in clusters by their match 

10 fractions above the threshold; 

11 adding each gene not already in a cluster to a cluster 

12 having another gene having a highest match fraction with the each 

13 gene without regard of the threshold; 

14 removing a first gene from a cluster when the first gene is 

15 also in another cluster which has another gene with a higher 

16 match fraction with the first gene than any of the genes in the 

17 cluster have with the first gene; 

18 repeating the removing step until all genes are listed in 

19 only one cluster; 

20 providing output of the listing of clusters of gene 

21 expression profiles. 
1 4. A method for determining similarity between portions of

2 gene expression profiles comprising the steps of: 

3 processing a number of gene expression profiles with a 

4 similar sequences algorithm that is a time and intensity 

5 invariant correlation function with a computer to obtain a data 

6 set of gene expression pairs and a match fraction for each pair; 

7 choosing a threshold match fraction; 

8 listing gene expression pairs in clusters by their match 

9 fractions above the threshold; 

10 adding each gene not already in a cluster to a cluster 

11 having another gene having a highest match fraction disregarding 

12 the threshold with the each gene; 

13 removing a first gene from a first cluster when the first 

14 gene is also in a second cluster which has another gene with a 

15 higher match fraction with the first gene than any of the genes 

16 in the first cluster have with the first gene, but; 

17 retaining the first gene in the first cluster and removing 

18 the first gene from the second cluster when the difference 

19 between the highest match fraction of the first gene with a gene 

20 in the first cluster and the highest match fraction of the first 

21 gene with a gene in the second cluster is less than a minimum 

22 difference threshold and the number of subsequences represented 

23 in the similar gene pair having the highest match fraction in the 

24 first cluster is higher than the number of subsequences 

25 represented in the similar gene pair having the highest match 

26 fraction in the second cluster; 

27 repeating the removing and retaining steps until all genes 

28 are listed in only one cluster; 

29 providing output of the listing of clusters of gene

30 expression profiles. 



Serial No.: 10/629,448    CHA920030003US1



1 5. A method in a computer for determining similarity between 

2 genes comprising the steps of: 

3 listing genes to be compared in a data set by their gene 

4 expression profiles; 

5 processing the listed gene expression profiles with a 

6 similar sequences algorithm that is a time and intensity 

7 invariant correlation function to obtain a data set of gene 

8 expression pairs and a match fraction for each pair; 

9 choosing a threshold match fraction; 

10 creating a set G in which to list indices of genes accounted 

11 for; 

12 assigning genes i and j to a cluster a if they have a match 

13 fraction greater than the threshold; 

14 assigning gene k to the cluster a if it has a match fraction 

15 greater than the threshold with either gene i or gene j;

16 assigning genes k and l to a cluster b if they have a match

17 fraction greater than the threshold and if both gene k and gene l

18 do not have match fractions above the threshold with either gene

19 i or gene j;

20 repeating the assigning steps until all genes to be compared 

21 have been considered; 

22 removing a first gene from a cluster when another cluster 

23 has another gene with a higher match fraction with the first 

24 gene;

25 repeating the removing step until all genes are listed in 

26 only one cluster; 

27 providing output of the listing of clusters of gene 

28 expression profiles. 
29 
1 6. A method in a computer for determining similarity between

2 genes comprising the steps of: 

3 listing genes to be compared in a data set by their gene 

4 expression profiles; 

5 processing the listed gene expression profiles with a 

6 similar sequences algorithm that is a time and intensity 

7 invariant correlation function to obtain a data set of gene 

8 expression pairs and a match fraction for each pair; 

9 choosing a threshold match fraction; 

10 creating a set G in which to list indices of genes accounted 

11 for; 

12 assigning genes i and j to cluster 1 if they have a match 

13 fraction greater than the threshold; 

14 assigning gene k to cluster 1 if it has a match fraction 

15 greater than the threshold with either gene i or gene j;

16 assigning genes k and l to cluster 2 if they have a match

17 fraction greater than the threshold and if both gene k and gene l

18 do not have match fractions above the threshold with either gene

19 i or gene j;

20 removing a first gene from a cluster when another cluster 

21 has another gene with a higher match fraction with the first 

22 gene, unless the another gene requires a larger number of 

23 subsequences to achieve similarity with the first gene; 

24 repeating the removing step until all genes are listed in 

25 only one cluster; 

26 providing output of the listing of clusters of gene 

27 expression profiles. 
28 
1 10. A program product having computer readable code stored

2 on a recordable media for determining similarity between portions 

3 of gene expression profiles comprising: 

4 programmed means for processing a number of gene expression 

5 profiles with a similar sequences algorithm that is a time and 

6 intensity invariant correlation function to obtain a data set of 

7 gene expression pairs and a match fraction for each pair; 

8 programmed means for listing gene expression pairs in 

9 clusters by their match fractions; 

10 programmed means for removing a first gene from a cluster

11 when the first gene is also in another cluster which has another 

12 gene with a higher match fraction with the first gene than any of 

13 the genes in the cluster have with the first gene; 

14 programmed means for repeating the removing step until all 

15 genes are listed in only one cluster. 
1 11. A program product having computer readable code stored

2 on a recordable media for determining similarity between portions 

3 of gene expression profiles using output from a similar sequences 

4 algorithm that is a time and intensity invariant correlation 

5 function comprising: 

6 programmed means for providing a gene expression profile 

7 data set as input to programmed means embodying a similar 

8 sequences algorithm that is a time and intensity invariant 

9 correlation function to obtain a data set of gene expression 

10 pairs and a match fraction for each pair as output from the 

11 programmed means embodying a similar sequences algorithm; 

12 programmed means for listing the gene expression pairs in 

13 clusters by their match fractions; 

14 programmed means for removing a first gene from a cluster 

15 when the first gene is also in another cluster which has another 

16 gene with a higher match fraction with the first gene than any of 

17 the genes in the cluster have with the first gene; 

18 programmed means for repeating the removing step until all 

19 genes are listed in only one cluster. 
1 12. A program product having computer readable code stored

2 on a recordable media for determining similarity between portions 

3 of gene expression profiles comprising the steps of: 

4 programmed means for processing a number of gene expression 

5 profiles with a similar sequences algorithm that is a time and 

6 intensity invariant correlation function to obtain a data set of 

7 gene expression pairs and a match fraction for each pair; 

8 programmed means for listing gene expression pairs in 

9 clusters by their match fractions; 

10 programmed means for removing a first gene from a first 

11 cluster when the first gene is also in a second cluster which has 

12 another gene with a higher match fraction with the first gene 

13 than any of the genes in the first cluster have with the first 

14 gene, but; 

15 programmed means for retaining the first gene in the first 

16 cluster and removing the first gene from the second cluster when 

17 the difference between the highest match fraction of the first 

18 gene with a gene in the first cluster and the highest match 

19 fraction of the first gene with a gene in the second cluster is 

20 less than a minimum difference threshold and the number of 

21 subsequences represented in the similar gene pair having the 

22 highest match fraction in the first cluster is higher than the 

23 number of subsequences represented in the similar gene pair 

24 having the highest match fraction in the second cluster; 

25 programmed means for repeating the removing step until all 

26 genes are listed in only one cluster. 
1 13. A program product having computer readable code stored

2 on a recordable media for determining similarity between portions 

3 of gene expression profiles comprising the steps of: 

4 programmed means for processing a number of gene expression 

5 profiles with a similar sequences algorithm that is a time and 

6 intensity invariant correlation function to obtain a data set of 

7 gene expression pairs and a match fraction for each pair; 

8 programmed means for choosing a threshold match fraction; 

9 programmed means for listing gene expression pairs in 

10 clusters by their match fractions above the threshold; 

11 programmed means for adding each gene not already in a 

12 cluster to a cluster having another gene having a highest match 

13 fraction with the each gene without regard of the threshold; 

14 programmed means for removing a first gene from a cluster 

15 when the first gene is also in another cluster which has another 

16 gene with a higher match fraction with the first gene than any of 

17 the genes in the cluster have with the first gene; 

18 programmed means for repeating the removing step until all 

19 genes are listed in only one cluster. 
1 14. A program product having computer readable code stored

2 on a recordable media for determining similarity between portions 

3 of gene expression profiles comprising the steps of: 

4 programmed means for processing a number of gene expression 

5 profiles with a similar sequences algorithm that is a time and 

6 intensity invariant correlation function to obtain a data set of 

7 gene expression pairs and a match fraction for each pair; 

8 programmed means for choosing a threshold match fraction; 

9 programmed means for listing gene expression pairs in 

10 clusters by their match fractions above the threshold; 

11 programmed means for adding each gene not already in a 

12 cluster to a cluster having another gene having a highest match 

13 fraction disregarding the threshold with the each gene; 

14 programmed means for removing a first gene from a first 

15 cluster when the first gene is also in a second cluster which has 

16 another gene with a higher match fraction with the first gene 

17 than any of the genes in the first cluster have with the first 

18 gene, but; 

19 programmed means for retaining the first gene in the first 

20 cluster and removing the first gene from the second cluster when 

21 the difference between the highest match fraction of the first 

22 gene with a gene in the first cluster and the highest match 

23 fraction of the first gene with a gene in the second cluster is 

24 less than a minimum difference threshold and the number of 

25 subsequences represented in the similar gene pair having the 

26 highest match fraction in the first cluster is higher than the 

27 number of subsequences represented in the similar gene pair 

28 having the highest match fraction in the second cluster; 

29 programmed means for repeating the removing and retaining 

30 steps until all genes are listed in only one cluster. 
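
For illustration only, the removing and retaining steps recited in claim 14 can be sketched in Python. This is a minimal sketch under stated assumptions, not the Appellants' implementation: the names `best_partner`, `choose_cluster`, `match`, `subseq` and `min_diff` are hypothetical, and match fractions and subsequence counts are assumed to be stored in dictionaries keyed by unordered gene pairs.

```python
def best_partner(gene, cluster, match):
    """Return (partner, match_fraction) for the member of `cluster` that
    matches `gene` best; (None, 0.0) if the cluster holds no other gene."""
    candidates = [(other, match.get(frozenset((gene, other)), 0.0))
                  for other in cluster if other != gene]
    return max(candidates, key=lambda c: c[1], default=(None, 0.0))


def choose_cluster(gene, first, second, match, subseq, min_diff):
    """Decide which of two clusters keeps `gene` (assumed to be in both).

    match   : {frozenset({a, b}): match fraction}        (assumed layout)
    subseq  : {frozenset({a, b}): number of subsequences} (assumed layout)
    min_diff: the minimum difference threshold
    """
    p1, m1 = best_partner(gene, first, match)
    p2, m2 = best_partner(gene, second, match)
    if m2 <= m1:
        return "first"          # second cluster offers no better match
    # Retaining step: the two best matches are close, and the pair in the
    # first cluster is represented by more subsequences.
    close = (m2 - m1) < min_diff
    s1 = subseq.get(frozenset((gene, p1)), 0)
    s2 = subseq.get(frozenset((gene, p2)), 0)
    if close and s1 > s2:
        return "first"
    return "second"             # removing step: gene leaves the first cluster


# Hypothetical data: gene "a" sits in two clusters.
match = {frozenset(("a", "b")): 0.80, frozenset(("a", "c")): 0.82}
subseq = {frozenset(("a", "b")): 5, frozenset(("a", "c")): 2}
decision = choose_cluster("a", {"a", "b"}, {"a", "c"}, match, subseq, 0.05)
```

With these made-up values the two best match fractions differ by less than `min_diff` and the pair in the first cluster has more subsequences, so the gene is retained in the first cluster; with a smaller `min_diff` the plain removing step would apply instead.
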
1 15. A program product having computer readable code stored 

2 on a recordable media for determining similarity between genes 

3 comprising the steps of: 

4 programmed means for listing genes to be compared by their 

5 gene expression profiles; 

6 programmed means for processing the listed gene expression 

7 profiles with a similar sequences algorithm that is a time and 

8 intensity invariant correlation function to obtain a data set of 

9 gene expression pairs and a match fraction for each pair; 

10 programmed means for choosing a threshold match fraction; 

11 programmed means for creating a null set G(0) to hold genes 

12 accounted for; 

13 programmed means for assigning genes i and j to cluster 1 if 

14 they have a match fraction greater than the threshold; 

15 programmed means for assigning gene k to cluster 1 if it has 

16 a match fraction greater than the threshold with either gene i or 

17 gene j; 

18 programmed means for assigning genes k and l to cluster 2 if 

19 they have a match fraction greater than the threshold and if both 

20 gene k and gene l do not have match fractions above the threshold 

21 with either gene i or gene j; 

22 programmed means for removing a first gene from a cluster 

23 when another cluster has another gene with a higher match 

24 fraction with the first gene; 

25 programmed means for repeating the removing step until all 

26 genes are listed in only one cluster. 
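
For illustration only, the cluster-assignment steps recited in claim 15 can be sketched in Python. This is a sketch under stated assumptions rather than the claimed program: the function name `seed_clusters` and the dictionary layout are hypothetical, the set `accounted` plays the role of the initially empty set G(0), and the two-cluster wording of the claim is generalized to any number of clusters.

```python
def seed_clusters(match, threshold):
    """Group genes into clusters from above-threshold pairs.

    match: {frozenset({a, b}): match fraction}  (assumed layout)
    Returns (clusters, accounted).
    """
    # Pairs whose match fraction exceeds the chosen threshold.
    strong = {pair for pair, m in match.items() if m > threshold}
    accounted = set()        # starts empty, like G(0); holds genes accounted for
    clusters = []
    # Visit pairs in a deterministic order (sorted by gene names).
    for pair in sorted(strong, key=sorted):
        i, j = sorted(pair)
        if i in accounted or j in accounted:
            continue         # this pair cannot seed a new cluster
        cluster = {i, j}     # genes i and j seed the cluster
        for other in strong:
            if other & {i, j}:
                cluster |= other   # gene k matches i or j above the threshold
        clusters.append(cluster)
        accounted |= cluster
    return clusters, accounted


# Hypothetical match fractions for five genes.
match = {
    frozenset(("i", "j")): 0.90,
    frozenset(("i", "k")): 0.80,
    frozenset(("l", "m")): 0.85,
    frozenset(("j", "l")): 0.20,
}
clusters, accounted = seed_clusters(match, threshold=0.5)
```

With more than two seed pairs, a gene that exceeds the threshold with members of two different seed pairs lands in both clusters, which is the situation the removing step then resolves.
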
1 16. A program product having computer readable code stored 

2 on a recordable media for determining similarity between genes 

3 comprising the steps of: 

4 programmed means for listing genes to be compared by their 

5 gene expression profiles; 

6 programmed means for processing the listed gene expression 

7 profiles with a similar sequences algorithm that is a time and 

8 intensity invariant correlation function to obtain a data set of 

9 gene expression pairs and a match fraction for each pair; 

10 programmed means for choosing a threshold match fraction; 

11 programmed means for creating a null set G(0) to hold genes 

12 accounted for; 

13 programmed means for assigning genes i and j to cluster 1 if 

14 they have a match fraction greater than the threshold; 

15 programmed means for assigning gene k to cluster 1 if it has 

16 a match fraction greater than the threshold with either gene i or 

17 gene j; 

18 programmed means for assigning genes k and l to cluster 2 if 

19 they have a match fraction greater than the threshold and if both 

20 gene k and gene l do not have match fractions above the threshold 

21 with either gene i or gene j; 

22 programmed means for removing a first gene from a cluster 

23 when another cluster has another gene with a higher match 

24 fraction with the first gene, unless the another gene requires a 

25 larger number of subsequences to achieve similarity with the 

26 first gene; 

27 programmed means for repeating the removing step until all 

28 genes are listed in only one cluster. 
1 20. In a method of determining similarity between portions 

2 of gene expression profiles which includes processing a number of 

3 gene expression profiles using a computer with a similar 

4 sequences algorithm that is a time and intensity invariant 

5 correlation function to obtain a data set of gene expression 

6 pairs and a match fraction for each pair, the improvement 

7 comprising the steps of: 

8 listing gene expression pairs in clusters by their match 

9 fractions; 

10 removing a first gene from a cluster when another cluster 

11 has another gene with a higher match fraction with the first 

12 gene, unless the another gene requires a larger number of 

13 subsequences to achieve similarity with the first gene; 

14 repeating the removing step until all genes are listed in 

15 only one cluster; 

16 providing output of the listing of clusters of gene 

17 expression profiles. 
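
For illustration only, the removing, repeating and output steps recited in claim 20 can be sketched in Python. This is a sketch under stated assumptions, not the claimed method itself: the names `best`, `resolve_overlaps` and `format_clusters` are hypothetical, the dictionary layout is assumed, and where the "unless" condition blocks a removal the sketch assumes the gene stays in its current cluster and is dropped from the other one.

```python
def best(gene, cluster, match, subseq):
    """Return (match_fraction, subsequence_count) for the best partner of
    `gene` within `cluster`; (0.0, 0) if the cluster has no other gene."""
    scores = [(match.get(frozenset((gene, g)), 0.0),
               subseq.get(frozenset((gene, g)), 0))
              for g in cluster if g != gene]
    return max(scores, key=lambda s: s[0], default=(0.0, 0))


def resolve_overlaps(clusters, match, subseq):
    """Leave every gene in exactly one cluster."""
    clusters = [set(c) for c in clusters]            # work on copies
    for gene in sorted(set().union(*clusters)):      # repeat for each gene
        homes = [c for c in clusters if gene in c]
        keep = homes[0]
        for other in homes[1:]:
            m_keep, s_keep = best(gene, keep, match, subseq)
            m_other, s_other = best(gene, other, match, subseq)
            # Move only if the other cluster matches better AND does not
            # need a larger number of subsequences (the "unless" clause).
            if m_other > m_keep and s_other <= s_keep:
                keep.discard(gene)
                keep = other
            else:
                other.discard(gene)   # assumption: gene stays where it is
    return [c for c in clusters if c]


def format_clusters(clusters):
    """Render the final listing of clusters as text output."""
    return "\n".join(f"cluster {n}: {', '.join(sorted(c))}"
                     for n, c in enumerate(clusters, start=1))


# Hypothetical data: gene "x" starts in both clusters.
match = {frozenset(("x", "i")): 0.7, frozenset(("x", "k")): 0.9}
subseq = {frozenset(("x", "i")): 3, frozenset(("x", "k")): 2}
result = resolve_overlaps([{"i", "j", "x"}, {"k", "l", "x"}], match, subseq)
listing = format_clusters(result)
```

Because a gene is only ever removed from clusters, one pass over the genes leaves each gene in a single cluster, which corresponds to repeating the removing step until no gene is listed twice.
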
